Deciding the Value 1 Problem for ]-acyclic Partially Observable Markov Decision Processes
نویسندگان
چکیده
The value 1 problem is a natural decision problem in algorithmic game theory. For partially observable Markov decision processes with reachability objective, this problem is defined as follows: are there strategies that achieve the reachability objective with probability arbitrarily close to 1? This problem was shown undecidable recently. Our contribution is to introduce a class of partially observable Markov decision processes, namely ]-acyclic partially observable Markov decision processes, for which the value 1 problem is decidable. Our algorithm is based on the construction of a two-player perfect information game, called the knowledge game, abstracting the behaviour of a ]-acyclic partially observable Markov decision process M such that the first player has a winning strategy in the knowledge game if and only if the value of M is 1.
منابع مشابه
Deciding the Value 1 Problem for $\sharp$ -acyclic Partially Observable Markov Decision Processes
The value 1 problem is a natural decision problem in algorithmic game theory. For partially observable Markov decision processes with reachability objective, this problem is defined as follows: are there observational strategies that achieve the reachability objective with probability arbitrarily close to 1? This problem was shown undecidable recently. Our contribution is to introduce a class o...
متن کاملPOMDPs under Probabilistic Semantics
We consider partially observable Markov decision processes (POMDPs) with limitaverage payoff, where a reward value in the interval [0, 1] is associated to every transition, and the payoff of an infinite path is the long-run average of the rewards. We consider two types of path constraints: (i) quantitative constraint defines the set of paths where the payoff is at least a given threshold λ1 ∈ (...
متن کاملA POMDP Framework to Find Optimal Inspection and Maintenance Policies via Availability and Profit Maximization for Manufacturing Systems
Maintenance can be the factor of either increasing or decreasing system's availability, so it is valuable work to evaluate a maintenance policy from cost and availability point of view, simultaneously and according to decision maker's priorities. This study proposes a Partially Observable Markov Decision Process (POMDP) framework for a partially observable and stochastically deteriorating syste...
متن کاملEecient Dynamic-programming Updates in Partially Observable Markov Decision Processes
We examine the problem of performing exact dynamic-programming updates in partially observable Markov decision processes (pomdps) from a computational complexity viewpoint. Dynamic-programming updates are a crucial operation in a wide range of pomdp solution methods and we nd that it is intractable to perform these updates on piecewise-linear convex value functions for general pomdps. We offer ...
متن کاملOn the Undecidability of Probabilistic Planning and Infinite-Horizon Partially Observable Markov Decision Problems
We investigate the computability of problems in probabilistic planning and partially observable infinite-horizon Markov decision processes. The undecidability of the string-existence problem for probabilistic finite automata is adapted to show that the following problem of plan existence in probabilistic planning is undecidable: given a probabilistic planning problem, determine whether there ex...
متن کامل